Robust Example Search Using Bottleneck Features for Example-Based Speech Enhancement

نویسندگان

  • Atsunori Ogawa
  • Shogo Seki
  • Keisuke Kinoshita
  • Marc Delcroix
  • Takuya Yoshioka
  • Tomohiro Nakatani
  • Kazuya Takeda
چکیده

Example-based speech enhancement is a promising approach for coping with highly non-stationary noise. Given a noisy speech input, it first searches in noisy speech corpora for the noisy speech examples that best match the input. Then, it concatenates the clean speech examples that are paired with the matched noisy examples to obtain an estimate of the underlying clean speech component in the input. This framework works well if the noisy speech corpora contain the noise included in the input. However, it is impossible to prepare corpora that cover all types of noisy environments. Moreover, the example search is usually performed using noise sensitive mel-frequency cepstral coefficient features (MFCCs). Consequently, a mismatch between an input and the corpora is inevitable. This paper proposes using bottleneck features (BNFs) extracted from a deep neural network (DNN) acoustic model for the example search. Since BNFs have good noise robustness (invariance), the mismatch is mitigated and thus a more accurate example search can be performed. Experimental results on the Aurora4 corpus show that the example-based approach using BNFs greatly improves the enhanced speech quality compared with that using MFCCs. It also consistently outperforms a conventional DNN-based approach, i.e. a denoising autoencoder.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improved Example-Based Speech Enhancement by Using Deep Neural Network Acoustic Model for Noise Robust Example Search

Example-based speech enhancement is a promising singlechannel approach for coping with highly nonstationary noise. Given a noisy speech input, it first searches in a noisy speech corpus for the noisy speech examples that best match the input. Then, it concatenates the clean speech examples that are paired with the matched noisy examples to obtain an estimate of the underlying clean speech compo...

متن کامل

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have been shown their performance in speech recognition systems for extracting features, and also acoustic modeling. In addition, CNNs have been used for robust speech recognition and competitive results have been reported. Convolutive Bottleneck Network (CBN) is a kind of CNNs which has a bottleneck layer among its fully connected layers. The bottleneck fea...

متن کامل

Speech Enhancement using Adaptive Data-Based Dictionary Learning

In this paper, a speech enhancement method based on sparse representation of data frames has been presented. Speech enhancement is one of the most applicable areas in different signal processing fields. The objective of a speech enhancement system is improvement of either intelligibility or quality of the speech signals. This process is carried out using the speech signal processing techniques ...

متن کامل

Toward High-Performance Language-Independent Query-by-Example Spoken Term Detection for MediaEval 2015: Post-Evaluation Analysis

This paper documents the significant components of a state-ofthe-art language-independent query-by-example spoken term detection system designed for the Query by Example Search on Speech Task (QUESST) in MediaEval 2015. We developed exact and partial matching DTW systems, and WFST based symbolic search systems to handle different types of search queries. To handle the noisy and reverberant spee...

متن کامل

The IIT-B Query-by-Example System for MediaEval 2015

This paper describes the system developed at I.I.T. Bombay for Query-by-Example Search on Speech Task (QUESST) within the MediaEval 2015 evaluation framework. Our system preprocesses the data to remove noise and performs subsequence DTW on posterior/bottleneck features obtained using four phone recognition systems to detect the queries. Scores from each of these subsystems are fused to get the ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016